Abstract: Cloud computing environments let stakeholders such as customers scale their resource
usage up and down. In this paper, we present a system that uses virtualization technology to
allocate data center resources dynamically based on application demands and that supports green
computing by optimizing the number of servers in use. We introduce the concept of "skewness" to
measure the unevenness in the multidimensional resource utilization of a server. By minimizing
skewness, we can combine different types of workloads nicely and improve the overall utilization
of server resources. We develop a set of heuristics that prevent overload in the system
effectively while saving energy. Trace-driven simulation and experiment results demonstrate that
our algorithm achieves good performance.

Index Terms: Cloud computing, resource management, virtualization, green computing.

1. Introduction

There is much discussion about how to move legacy applications onto the cloud platform. Here we
study how a cloud service provider can best multiplex its virtual resources onto the physical
hardware. This is important because much of the touted gains in the cloud model come from
multiplexing. It is observed that in many existing data centers the servers are underutilized due
to overprovisioning for the peak demand [1], [2]. The cloud model is expected to make such
practice unnecessary by offering automatic scale up and down in response to load variation.
Besides reducing the hardware cost, it also saves on electricity, which contributes to a
significant portion of the operational expenses in large data centers.



Virtual machine monitors (VMMs) like Xen provide a mechanism for mapping virtual
machines (VMs) to physical resources [3]. This mapping is largely hidden from the cloud users.
It is up to the cloud provider to make sure the underlying physical machines (PMs) have sufficient
resources to meet their needs. VM live migration technology makes it possible to change the
mapping between VMs and PMs while applications are running [5], [6]. However, a policy issue
remains as how to decide the mapping adaptively so that the resource demands of VMs are met
while the number of PMs used is minimized. This is challenging when the resource needs of VMs
are heterogeneous due to the diverse set of applications they run and vary with time as the
workloads grow and shrink.



We aim to achieve two goals in our algorithm: 

• Avoiding overloading: The capacity of a PM should be sufficient to satisfy the resource needs
of all VMs running on it. Otherwise, the PM is overloaded and can lead to degraded 
performance of its VMs. 

• Green Computing: The number of PMs used should be minimized as long as they can still 
satisfy the needs of all VMs. Idle PMs can be turned off to save energy.

These two goals pull in opposite directions. For overload avoidance, we should keep the
utilization of PMs low to reduce the possibility of overload in case the resource needs of VMs
increase later. For green computing, we should keep the utilization of PMs reasonably high to
make efficient use of their energy.

In this paper, we present the design and implementation of an automated resource management 
system that achieves a good balance between the two goals. We make the following contributions: 



• We develop a resource allocation system that can avoid overload in the system effectively
while minimizing the number of servers used.

• We introduce the concept of "skewness" to measure the uneven utilization of a server. By
minimizing skewness, we can improve the overall utilization of servers in the face of
multidimensional resource constraints.

• We design a load prediction algorithm that can capture the future resource usages of
applications accurately without looking inside the VMs. The algorithm can capture the rising
trend of resource usage patterns and help reduce the placement churn significantly.




The rest of the paper is organized as follows. Section 2 provides an overview of our system and
Section 3 describes our skewness algorithm. Section 4 provides simulation and Section 5 presents
experiment results, respectively. Section 6 discusses related work and Section 7 concludes.

2. System Overview



Figure 1. System architecture (the VM Scheduler comprises the Predictor, the Hot Spot Solver,
and the Cold Spot Solver, and outputs a Migration List).

The architecture of the system is presented in Fig. 1. Each PM runs the Xen hypervisor (VMM),
which supports a privileged domain 0 and one or more domain U [3]. Each VM in domain U
encapsulates one or more applications such as Web server, remote desktop, DNS, Mail,
Map/Reduce, etc. We assume all PMs share a backend storage.



The multiplexing of VMs to PMs is managed using the Usher framework [7]. The main logic of our
system is implemented as a set of plug-ins to Usher. Each node runs an Usher local node
manager (LNM) on domain 0, which collects the usage statistics of resources for each VM on that
node. The CPU and network usage can be calculated by monitoring the scheduling events in
Xen. The memory usage within a VM, however, is not visible to the hypervisor. One
approach is to infer memory shortage of a VM by observing its swap activities [8]. Unfortunately,
the guest OS is required to install a separate swap partition. Furthermore, it may be too late to
adjust the memory allocation by the time swapping occurs. Instead, we implemented a working
set prober (WS Prober) on each hypervisor to estimate the working set sizes of VMs
running on it. We use the random page sampling technique as in the VMware ESX Server [9].

The statistics collected at each PM are forwarded to the Usher central controller (Usher CTRL)
where our VM scheduler runs. The VM Scheduler is invoked periodically and receives from the
LNM the resource demand history of VMs, the capacity and the load history of PMs, and the
current layout of VMs on PMs. The scheduler has several components. The predictor predicts
the future resource demands of VMs and the future load of PMs based on past statistics. We
compute the load of a PM by aggregating the resource usage of its VMs. The LNM at each node
first attempts to satisfy the new demands locally by adjusting the resource allocation of VMs
sharing the same VMM. Xen can change the CPU allocation among the VMs by adjusting their
weights in its CPU scheduler. The MM Allotter on domain 0 of each node is responsible for
adjusting the local memory allocation.

The hot spot solver in our VM Scheduler detects if the resource utilization of any PM is above
the hot threshold (i.e., a hot spot). If so, some VMs running on them will be migrated away to
reduce their load. The cold spot solver checks if the average utilization of actively used PMs
(APMs) is below the green computing threshold. If so, some of those PMs could potentially be
turned off to save energy. It identifies the set of PMs whose utilization is below the cold
threshold (i.e., cold spots) and then attempts to migrate away all their VMs. It then compiles a
migration list of VMs and passes it to Usher CTRL for execution.
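
The checks performed by the two solvers can be illustrated with a minimal Python sketch. It is not the Usher plug-in code: the `PM` record, the function names, and the per-resource threshold dictionaries are hypothetical, utilizations are assumed to be fractions between 0 and 1, and treating only active PMs as cold spot candidates is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List

Thresholds = Dict[str, float]  # resource name -> threshold (fraction of capacity)


@dataclass
class PM:
    """Hypothetical record for one physical machine."""
    name: str
    utilization: Dict[str, float]  # predicted utilization per resource, 0.0 to 1.0
    vm_count: int                  # number of VMs currently placed on this PM


def is_hot_spot(pm: PM, hot: Thresholds) -> bool:
    # A PM is a hot spot if ANY resource exceeds its hot threshold.
    return any(pm.utilization[r] > hot[r] for r in hot)


def is_cold_spot(pm: PM, cold: Thresholds) -> bool:
    # A PM is a cold spot if ALL resources are below the cold threshold.
    # Assumption: an inactive PM (no VMs) has nothing to migrate, so it is skipped.
    return pm.vm_count > 0 and all(pm.utilization[r] < cold[r] for r in cold)


def green_computing_triggered(pms: List[PM], green: Thresholds) -> bool:
    # Consolidation is considered only when the average utilization of every
    # resource over the actively used PMs (APMs) is below the green threshold.
    active = [pm for pm in pms if pm.vm_count > 0]
    if not active:
        return False
    return all(
        sum(pm.utilization[r] for pm in active) / len(active) < green[r]
        for r in green
    )
```

In this sketch the scheduler would call `is_hot_spot` on every PM in each run, and call `is_cold_spot` only after `green_computing_triggered` returns true.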



3 The Skewness Algorithm





We introduce the concept of skewness to quantify the unevenness in the utilization of multiple
resources on a server. Let $n$ be the number of resources we consider and $r_i$ be the
utilization of the $i$th resource. We define the resource skewness of a server $p$ as

$$\mathrm{skewness}(p) = \sqrt{\sum_{i=1}^{n} \left( \frac{r_i}{\bar{r}} - 1 \right)^2}$$

where $\bar{r}$ is the average utilization of all resources for server $p$. In practice, not all
types of resources are performance critical and hence we only need to consider bottleneck
resources in the above calculation. By minimizing the skewness, we can combine different types of
workloads nicely and improve the overall utilization of server resources. In the following, we
describe the details of our algorithm. Analysis of the algorithm is presented in Section 1.
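
As a small worked illustration of the skewness definition, the following Python sketch computes the metric from a list of per-resource utilizations. The function name and the convention of returning zero for a completely idle server (where the average utilization is zero and the ratio is undefined) are assumptions of this sketch, not part of the definition.

```python
import math
from typing import Sequence


def skewness(utilizations: Sequence[float]) -> float:
    """Skewness of one server: sqrt(sum_i (r_i / mean - 1)^2)."""
    n = len(utilizations)
    if n == 0:
        raise ValueError("at least one resource is required")
    mean = sum(utilizations) / n
    if mean == 0:
        # Assumption: an idle server is treated as perfectly balanced.
        return 0.0
    return math.sqrt(sum((r / mean - 1.0) ** 2 for r in utilizations))
```

For example, a server with CPU at 90 percent and memory at 10 percent has a higher skewness than one with CPU at 60 percent and memory at 40 percent, even though both have the same average utilization.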



3.1 Hot and Cold Spots

Our algorithm executes periodically to evaluate the resource allocation status based on the
predicted future resource demands of VMs. We define a server as a hot spot if the utilization
of any of its resources is above a hot threshold. This indicates that the server is overloaded
and hence some VMs running on it should be migrated away. We define the temperature of a hot
spot $p$ as the square sum of its resource utilization beyond the hot threshold:

$$\mathrm{temperature}(p) = \sum_{r \in R} (r - r_t)^2$$

where $R$ is the set of overloaded resources in server $p$ and $r_t$ is the hot threshold for
resource $r$. (Note that only overloaded resources are considered in the calculation.) The
temperature of a hot spot reflects its degree of overload. If a server is not a hot spot, its
temperature is zero.

We define a server as a cold spot if the utilizations of all its resources are below a cold
threshold. This indicates that the server is mostly idle and a potential candidate to turn off to
save energy. However, we do so only when the average resource utilization of all actively used
servers (i.e., APMs) in the system is below a green computing threshold. A server is actively used
if it has at least one VM running. Otherwise, it is inactive. Finally, we define the warm
threshold to be a level of resource utilization that is sufficiently high to justify having the
server running but not so high as to risk becoming a hot spot in the face of temporary
fluctuation of application resource demands.

Different types of resources can have different thresholds. For example, we can define the hot
thresholds for CPU and memory resources to be 90 and 80 percent, respectively. Thus a server is a
hot spot if either its CPU usage is above 90 percent or its memory usage is above 80 percent.

3.2 Hot Spot Mitigation

We sort the list of hot spots in the system in descending temperature (i.e., we handle the hottest
one first). Our goal is to eliminate all hot spots if possible. Otherwise, we keep their
temperature as low as possible. For each server p, we first decide which of its VMs should be
migrated away. We sort its list of VMs based on the resulting temperature of the server if that VM
is migrated away. We aim to migrate away the VM that can reduce the server's temperature the most.
In case of ties, we select the VM whose removal can reduce the skewness of the server the most.
For each VM in the list, we see if we can find a destination server to accommodate it. The server
must not become a hot spot after accepting this VM. Among all such servers, we select one whose
skewness can be reduced the most by accepting this VM. Note that this reduction can be negative,
which means we select the server whose skewness increases the least. If a destination server is
found, we record the migration of the VM to that server and update the predicted load of related
servers. Otherwise, we move onto the next VM in the list and try to find a destination server for
it. As long as we can find a destination server for any of its VMs, we consider this run of the
algorithm a success and then move onto the next hot spot.

Note that each run of the algorithm migrates away at most one VM from the overloaded server. This
does not necessarily eliminate the hot spot, but at least reduces its temperature. If it remains a
hot spot in the next decision run, the algorithm will repeat this process. It is possible to
design the algorithm so that it can migrate away multiple VMs during each run. But this can add
more load on the related servers during a period when they are already overloaded. We decide to
use this more conservative approach and leave the system some time to react before initiating
additional migrations.

3.3 Green Computing

When the resource utilization of active servers is too low, some of them can be turned off to
save energy. This is handled in our green computing algorithm. The challenge here is to reduce the
number of active servers during low load without sacrificing performance either now or in the
future. We need to avoid oscillation in the system.

Our green computing algorithm is invoked when the average utilizations of all resources on
active servers are below the green computing threshold. We sort the list of cold spots in the
system based on the ascending order of their memory size. Since we need to migrate away all its
VMs before we shut down an underutilized server, we define the memory size of a cold spot as the
aggregate memory size of all VMs running on it. Recall that our model assumes all VMs connect to a
shared backend storage. Hence, the cost of a VM live migration is determined mostly by its memory
footprint. Section 7 in the supplementary file explains in depth why memory is a good measure. We
try to eliminate the cold spot with the lowest cost first.

For a cold spot p, we check if we can migrate all its VMs somewhere else. For each VM on p, we try
to find a destination server to accommodate it. The resource utilizations of the server after
accepting the VM must be below the warm threshold. While we can save energy by consolidating
underutilized servers, overdoing it may create hot spots in the future. The warm threshold is
designed to prevent that. If multiple servers satisfy the above criterion, we prefer one that is
not a current cold spot. This is because increasing load on a cold spot reduces the likelihood
that it can be eliminated. However, we will accept a cold spot as the destination server if
necessary. All things being equal, we select a destination server whose skewness can be reduced
the most by accepting this VM. If we can find destination servers for all VMs on a cold spot, we
record the sequence of migrations and update the predicted load of related servers. Otherwise, we
do not migrate any of its VMs. The list of cold spots is also updated because some of them may no
longer be cold due to the proposed VM migrations in the above process.
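
The per-cold-spot check described above can be sketched in Python as follows. This is an illustrative sketch rather than the actual implementation: all names are hypothetical, loads and VM demands are assumed to be expressed as fractions of a common PM capacity (that is, homogeneous PMs), `pm_load` is assumed to contain only active PMs, and ties between equally good destinations are broken by name purely to keep the result deterministic.

```python
import math
from typing import Dict, List, Optional, Set, Tuple

Load = Dict[str, float]  # resource name -> utilization as a fraction of capacity


def skewness(load: Load) -> float:
    mean = sum(load.values()) / len(load)
    if mean == 0:
        return 0.0  # assumption: an idle server counts as balanced
    return math.sqrt(sum((u / mean - 1.0) ** 2 for u in load.values()))


def add(a: Load, b: Load) -> Load:
    return {r: a[r] + b[r] for r in a}


def plan_cold_spot_evacuation(
    cold_pm: str,
    vms_on_pm: Dict[str, Load],
    pm_load: Dict[str, Load],
    cold_spots: Set[str],
    warm: Load,
) -> Optional[List[Tuple[str, str]]]:
    """Return (vm, destination) pairs for ALL VMs on cold_pm, or None."""
    load = {pm: dict(l) for pm, l in pm_load.items()}  # working copy only
    plan: List[Tuple[str, str]] = []
    for vm, demand in vms_on_pm.items():
        candidates = []
        for dest, current in load.items():
            if dest == cold_pm:
                continue
            after = add(current, demand)
            # The destination must stay below the warm threshold on every resource.
            if all(after[r] < warm[r] for r in warm):
                reduction = skewness(current) - skewness(after)
                # Sort key: non-cold destinations first, then largest skewness reduction.
                candidates.append((dest in cold_spots, -reduction, dest))
        if not candidates:
            return None  # all-or-nothing: leave the cold spot untouched
        _, _, dest = min(candidates)
        load[dest] = add(load[dest], demand)
        plan.append((vm, dest))
    return plan
```

Because the function works on a copy of the predicted loads, a failed attempt leaves the caller's state unchanged, which mirrors the rule that no VM is migrated unless every VM on the cold spot can be placed.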

The above consolidation adds extra load onto the related servers. This is not as serious a problem
as in the hot spot mitigation case because green computing is initiated only when the load in the
system is low. Nevertheless, we want to bound the extra load due to server consolidation. We
restrict the number of cold spots that can be eliminated in each run of the algorithm to be no
more than a certain percentage of active servers in the system. This is called the consolidation
limit.

Note that we eliminate cold spots in the system only when the average load of all active servers
(APMs) is below the green computing threshold. Otherwise, we leave those cold spots there as
potential destination machines for future offloading. This is consistent with our philosophy that
green computing should be conducted conservatively.
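
The hot spot side of this section, namely the temperature definition and the rule of migrating at most one VM per hot spot in each run, can also be sketched in Python. The sketch is illustrative only: the function and variable names are hypothetical, and loads and VM demands are assumed to be fractions of a common PM capacity.

```python
import math
from typing import Dict, Optional, Tuple

Load = Dict[str, float]  # resource name -> utilization as a fraction of capacity


def skewness(load: Load) -> float:
    mean = sum(load.values()) / len(load)
    if mean == 0:
        return 0.0  # assumption: an idle server counts as balanced
    return math.sqrt(sum((u / mean - 1.0) ** 2 for u in load.values()))


def temperature(load: Load, hot: Load) -> float:
    # Square sum of utilization beyond the hot threshold, overloaded resources only.
    return sum((load[r] - hot[r]) ** 2 for r in hot if load[r] > hot[r])


def plus(a: Load, b: Load) -> Load:
    return {r: a[r] + b[r] for r in a}


def minus(a: Load, b: Load) -> Load:
    return {r: a[r] - b[r] for r in a}


def mitigate_hot_spot(
    hot_pm: str,
    vms_on_pm: Dict[str, Load],
    pm_load: Dict[str, Load],
    hot: Load,
) -> Optional[Tuple[str, str]]:
    """Pick at most one (vm, destination) migration for hot_pm, or None."""
    source = pm_load[hot_pm]

    def rank(vm: str):
        # Lowest resulting temperature first; ties go to lowest resulting skewness.
        rest = minus(source, vms_on_pm[vm])
        return (temperature(rest, hot), skewness(rest), vm)

    for vm in sorted(vms_on_pm, key=rank):
        best = None
        for dest, current in pm_load.items():
            if dest == hot_pm:
                continue
            after = plus(current, vms_on_pm[vm])
            if temperature(after, hot) > 0:
                continue  # destination must not become a hot spot
            reduction = skewness(current) - skewness(after)  # may be negative
            if best is None or reduction > best[0]:
                best = (reduction, dest)
        if best is not None:
            return vm, best[1]
    return None
```

A scheduler built on this sketch would sort the hot spots by descending `temperature`, call `mitigate_hot_spot` once per hot spot, and update the predicted loads after each recorded migration.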

4. Simulations 

We evaluate the performance of our algorithm using trace-driven simulation. Note that our
simulation uses the same code base for the algorithm as the real implementation in the
experiments. This ensures the fidelity of our simulation results. Traces are per-minute server
resource utilization, such as CPU rate, memory usage, and network traffic statistics, collected
using tools like "perfmon" (Windows), the "/proc" file system (Linux), "pmstat/vmstat/netstat"
commands (Solaris), etc. The raw traces are preprocessed into "Usher" format so that the simulator
can read them. We collected the traces from a variety of sources:

• Web InfoMall. The largest online Web archive in China (i.e., the counterpart of Internet
Archive in the US) with more than three billion archived Web pages.

• RealCourse. The largest online distance learning system in China with servers distributed
across 13 major cities.

• AmazingStore. The largest P2P storage system in China.




We also collected traces from servers and desktop computers in our university including one of our
mail servers, the central DNS server, and desktops in our department. We postprocessed the
traces based on days collected and use random sampling and linear combination of the data sets to
generate the workloads needed. All simulation in this section uses the real trace workload unless
otherwise specified.

We used the FUSD load prediction algorithm with ↑α = 0.2, ↓α = 0.7, and W = 8. In a dynamic
system, those parameters represent good knobs to tune the performance of the system adaptively. We
choose the default parameter values based on empirical experience working with many Internet
applications. In the future, we plan to explore using AI or control theoretic approach to find
near optimal values automatically.

Figure 2. Impact of thresholds on the number of APMs.



We first evaluate the effect of the various thresholds used in our algorithm. We simulate a system
with 100 PMs and 1,000 VMs (selected randomly from the trace). We use random VM to PM mapping in
the initial layout. The scheduler is invoked once per minute. The bottom part of Fig. 2 shows the
daily load variation in the system. The x-axis is the time of the day starting at 8 am. The y-axis
is overloaded with two meanings: the percentage of the load or the percentage of APMs (i.e.,
Active PMs) in the system. Recall that a PM is active (i.e., an APM) if it has at least one VM
running. As can be seen from the figure, the CPU load demonstrates diurnal patterns which decrease
substantially after midnight. The memory consumption is fairly stable over the time. The network
utilization stays very low.

The top part of Fig. 2 shows how the percentages of APMs vary with the load for different
thresholds in our algorithm. For example, "h0.7 g0.3 c0.1" means that the hot, the green
computing, and the cold thresholds are 70, 30, and 10 percent, respectively. Our algorithm can be
made more or less aggressive in its migration decision by tuning the thresholds. The figure shows
that lower hot thresholds cause more aggressive migrations to mitigate hot spots in the system and
increase the number of APMs, while higher cold and green computing thresholds cause more
aggressive consolidation, which leads to a smaller number of APMs. The percentage of APMs in our
algorithm follows the load pattern closely.

To examine the performance of our algorithm in more extreme situations, we also create a
synthetic workload which mimics the shape of a sine function (only the positive part) and ranges
from 15 to 95 percent with a 20 percent random fluctuation. It has a much larger peak-to-mean
ratio than the real trace. The results are shown in Section 2 of the supplementary file, which can
be found on the Computer Society Digital Library.
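
One possible way to generate such a synthetic workload is sketched below in Python. The text does not specify how the random fluctuation is applied, so this sketch makes several assumptions that should be read as such: the trace covers exactly the positive half of one sine period, the fluctuation is a multiplicative factor drawn uniformly from plus or minus 20 percent, and the result is clamped to the range 0 to 1. The function name and parameters are hypothetical.

```python
import math
import random
from typing import List, Optional


def synthetic_load(
    steps: int,
    low: float = 0.15,
    high: float = 0.95,
    fluctuation: float = 0.20,
    seed: Optional[int] = None,
) -> List[float]:
    """Half-sine load curve between `low` and `high` with random fluctuation."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    rng = random.Random(seed)
    loads = []
    for t in range(steps):
        # Positive half of a sine wave: starts at `low`, peaks at `high`, ends at `low`.
        base = low + (high - low) * math.sin(math.pi * t / (steps - 1))
        # Assumption: fluctuation is multiplicative and uniform in [-20%, +20%].
        noise = 1.0 + rng.uniform(-fluctuation, fluctuation)
        loads.append(min(1.0, max(0.0, base * noise)))
    return loads
```

With one sample per minute, `synthetic_load(1440, seed=1)` would produce one simulated day of utilization values.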
4.2 Scalability of the Algorithm

Figure 3. Scalability of the algorithm with system size.

We evaluate the scalability of our algorithm by varying the number of VMs in the simulation
between 200 and 1,400. The ratio of VM to PM is 10:1. The results are shown in Fig. 3. Fig. 3a
shows that the average decision time of our algorithm increases with the system size. The speed of
increase is between linear and quadratic. We break down the decision time into two parts: hot spot
mitigation (marked as "hot") and green computing (marked as "cold"). We find that hot spot
mitigation contributes more to the decision time. We also find that the decision time for the
synthetic workload is higher than that for the real trace due to the large variation in the
synthetic workload. With 140 PMs and 1,400 VMs, the decision time is about 1.3 seconds for the
synthetic workload and 0.2 second for the real trace.



Fig. 3b shows the average number of migrations in the whole system during each decision. The
number of migrations is small and increases roughly linearly with the system size. We find that
hot spot contributes more to the number of migrations. We also find that the number of
migrations in the synthetic workload is higher than that in the real trace. With 140 PMs and 1,400
VMs, on average each run of our algorithm incurs about three migrations in the whole system for
the synthetic workload and only 1.3 migrations for the real trace.

We compare the execution of our algorithm with and without load prediction in Fig. 4. When load
prediction is disabled, the algorithm simply uses the last observed load in its decision making.
Fig. 4a shows that load prediction significantly reduces the average number of hot spots in the
system during a decision run. Notably, prediction prevents over 46 percent hot spots in the
simulation with 1,400 VMs. This demonstrates its high effectiveness in preventing server overload
proactively. Without prediction, the algorithm tries to consolidate a PM as soon as its load drops
below the threshold. With prediction, the algorithm correctly foresees that the load of the PM
will increase above the threshold shortly and hence takes no action. This leaves the PM in the
"cold spot" state for a while. However, it also reduces placement churns by avoiding unnecessary
migrations due to temporary load fluctuation. Consequently, the number of migrations in the system
with load prediction is smaller than that without prediction as shown in Fig. 4c. We can adjust
the conservativeness of load prediction by tuning its parameters, but the current configuration
largely serves our purpose (i.e., erring on the side of caution). The only downside of having more
cold spots in the system is that it may increase the number of APMs. This is investigated in Fig.
4b, which shows that the average numbers of APMs remain essentially the same with or without
load prediction (the difference is less than 1 percent). This is appealing because significant
overload protection can be achieved without sacrificing resource efficiency. Fig. 4c compares the
average number of migrations per VM in each decision with and without load prediction. It shows
that each VM experiences 17 percent fewer migrations with load prediction.



5 Experiments 



Our experiments are conducted using a group of 30 Dell PowerEdge blade servers with Intel
E5620 CPU and 24 GB of RAM. The servers run Xen-3.3 and Linux 2.6.18. We deploy 8 VMs on
each server at the beginning. Each VM is configured with one virtual CPU and two gigabytes of
memory. Self-ballooning is enabled to allow the hypervisor to reclaim unused memory. Each VM
runs the server side of the TPC-W benchmark corresponding to various types of the workloads:
browsing, shopping, hybrid workloads, etc. Our algorithm is invoked every 10 minutes.

5.1 Algorithm Effectiveness

Figure 5. #APMs varies with TPC-W load.



We evaluate the effectiveness of our algorithm in overload mitigation and green computing. We
start with a small scale experiment consisting of three PMs and five VMs so that we can present
the results for all servers in Fig. 5. Different shades are used for each VM. All VMs are
configured with 128 MB of RAM. An Apache server runs on each VM. We use httperf to invoke CPU
intensive PHP scripts on the Apache server. This allows us to subject the VMs to different degrees
of CPU load by adjusting the client request rates. The utilization of other resources is kept low.
We first increase the CPU load of the three VMs on PM1 to create an overload. Our algorithm
resolves the overload by migrating VM3 to PM3. It reaches a stable state under high load around
420 seconds. Around 890 seconds, we decrease the CPU load of all VMs gradually. Because the FUSD
prediction algorithm is conservative when the load decreases, it takes a while before green
computing takes effect. Around 1,700 seconds, VM3 is migrated from PM3 to PM2 so that PM3 can be
put into the standby mode. Around 2,200 seconds, the two VMs on PM1 are migrated to PM2 so that
PM1 can be released as well. As the load goes up and down, our algorithm will repeat the above
process: spread over or consolidate the VMs as needed.



Next we extend the scale of the experiment to 30 servers. We use the TPC-W benchmark for this
experiment. TPC-W is an industry standard benchmark for e-commerce applications. A TPC-W server,
even when idle, consumes several hundred megabytes of memory. After two hours, we increase the
load dramatically to emulate a "flash crowd" event. The algorithm wakes up the stand-by servers to
offload the hot spot servers. The figure shows that the number of APMs increases accordingly.
After the request rates peak for about one hour, we reduce the load gradually to emulate that the
flash crowd is over. This triggers green computing again to consolidate the underutilized servers.
Fig. 5 shows that over the course of the experiment, the number of APMs rises much faster than it
falls. This is due to the effect of our FUSD load prediction. The figure also shows that the
number of APMs remains at a slightly elevated level after the flash crowd. This is because the
TPC-W servers maintain some data in cache and hence their memory usage never goes back to its
original level.

5.2 Impact of Live Migration

Figure 6. Impact of live migration on TPC-W performance (x-axis: session number).



One concern about the use of VM live migration is its impact on application performance. Previous
studies have found this impact to be small [5]. We investigate this impact in our own experiment.
We extract the data on the 340 live migrations in our 30 server experiment. We find that 139 of
them are for hot spot mitigation. We focus on these migrations because that is when the potential
impact on application performance is the most. Among the 139 migrations, we randomly pick seven
corresponding TPC-W sessions undergoing live migration. All these sessions run the "shopping mix"
workload with 200 emulated browsers. As a target for comparison, we rerun the session with the
same parameters but perform no migration and use the resulting performance as the baseline. Fig.
6 shows the normalized Web interactions per second (WIPS) for the 7 sessions. WIPS is the
performance metric used by TPC-W. The figure shows that most live migration sessions exhibit no
noticeable degradation in performance compared to the baseline: the normalized WIPS is close to
1. The only exception is session 3, whose degraded performance is caused by an extremely busy
server in the original experiment. Next we take a closer look at one of the sessions in Fig. 6 and
show how its performance varies over time. The figure verifies that live migration causes no
noticeable performance degradation. The duration of the migration is under 10 seconds. Recall that
our algorithm is invoked every 10 minutes.

Figure 7. Resource balance for mixed workloads.



Recall that the goal of the skewness algorithm is to mix workloads with different resource
requirements together so that the overall utilization of server capacity is improved. In this
experiment, we see how our algorithm handles a mix of CPU, memory, and network intensive
workloads. We vary the CPU load as before. We inject the network load by sending the VMs a series
of network packets. The memory intensive applications are created by allocating memory on
demand. Again we start with a small scale experiment consisting of two PMs and four VMs so that we
can present the results for all servers in Fig. 7. The two rows represent the two PMs. The two
columns represent the CPU and network dimensions, respectively. The memory consumption is
kept low for this experiment. Initially, the two VMs on PM1 are CPU intensive while the two VMs on
PM2 are network intensive. We increase the load of their bottleneck resources gradually. Around
500 seconds, VM4 is migrated from PM2 to PM1 due to the network overload in PM2. Then around
600 seconds, VM1 is migrated from PM1 to PM2 due to the CPU overload in PM1. Now the system
reaches a stable state with a balanced resource utilization for both PMs: each hosts one CPU
intensive VM and one network intensive VM. Later we decrease the load of all VMs gradually so that
both PMs become cold spots. We can see that the two VMs on PM1 are consolidated to PM2 by green
computing.

Next we extend the scale of the experiment to a group of 72 VMs running over 8 PMs. Half of the
VMs are CPU intensive, while the other half are memory intensive. Initially, we keep the load of
all VMs low and deploy all CPU intensive VMs on PM4 and PM5 and all memory intensive VMs on PM6
and PM7. Then we increase the load on all VMs gradually to make the underlying PMs hot spots. Fig.
12 shows how the algorithm spreads the VMs to other PMs over time. As we can see from the figure,
the algorithm balances the two types of VMs appropriately. The figure also shows that the load
across the set of PMs becomes well balanced as we increase the load.

6 Related Work 

Automatic scaling of Web applications was previously studied in [14] and [15] for data center
environments. In Muse [14], each server has replicas of all Web applications running in the
system. The dispatch algorithm in a frontend L7-switch makes sure requests are reasonably served
while minimizing the number of underutilized servers. Work [15] uses network flow algorithms to
allocate the load of an application among its running instances.

6.1 Resource Allocation by Live VM Migration

VM live migration is a widely used technique for dynamic resource allocation in a virtualized
environment [8], [12].

Our work also belongs to this category. Sandpiper combines multidimensional load information
into a single Volume metric. It sorts the list of PMs based on their volumes and the VMs in each
PM in their volume-to-size ratio (VSR). This unfortunately abstracts away critical information
needed when making the migration decision. It then considers the PMs and the VMs in the presorted
order. The results are analyzed in Section 5 of the supplementary file, which is available online,
to show how they behave differently. In addition, their work has no support for green computing
and differs from ours in many other aspects such as load prediction. Dynamic placement of virtual
servers to minimize SLA violations is studied in [12]. They model it as a bin packing problem and
use the well-known first-fit approximation algorithm to calculate the VM to PM layout
periodically. That algorithm, however, is designed mostly for offline use. It is likely to incur a
large number of migrations when applied in an online environment where the resource needs of VMs
change dynamically.
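
For readers unfamiliar with the first-fit heuristic mentioned above, a generic Python sketch follows. It is not the algorithm of [12]: it assumes a single resource dimension and identical PM capacities, and the function name and inputs are hypothetical.

```python
from typing import Dict, List


def first_fit(vm_demands: Dict[str, float], pm_capacity: float) -> List[List[str]]:
    """Place each VM on the first PM with enough free capacity, opening PMs as needed."""
    placement: List[List[str]] = []  # placement[i] lists the VMs on PM i
    free: List[float] = []           # free[i] is the remaining capacity of PM i
    for vm, demand in vm_demands.items():
        if demand > pm_capacity:
            raise ValueError(f"{vm} does not fit on any PM")
        for i, room in enumerate(free):
            if demand <= room:
                placement[i].append(vm)
                free[i] -= demand
                break
        else:
            # No existing PM can host the VM, so a new PM is opened.
            placement.append([vm])
            free.append(pm_capacity - demand)
    return placement
```

Because the layout is recomputed from scratch each time, two consecutive runs with slightly different demands can yield very different placements, which illustrates why such an approach may cause many migrations in an online setting.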

6.2 Green Computing 

Many efforts have been made to curtail energy consumption in data centers. Hardware-based 
approaches include novel thermal design for lower cooling power, or adopting power- 
proportional and low-power hardware.

Our work belongs to the category of pure-software low-cost solutions [10], [12], [14].

7. Conclusion 

We have presented the design, implementation, and evaluation of a resource management
system for cloud computing services. Our system multiplexes virtual to physical resources
adaptively based on the changing demand. We use the skewness metric to combine VMs with
different resource characteristics appropriately so that the capacities of servers are well
utilized. Our algorithm achieves both overload avoidance and green computing for systems with
resource constraints.